Some Conservative Statistical Approaches for Presenting
Interlaboratory  N.D.A.  Enrichment  Measurements 

By  Clifford  Spiegelman 

SECTION  1.  Introduction. 

Realistic evaluation of interlaboratory comparisons of measurements on
prototype  and  primary  standards  is  important  for  establishing  uniformity  of 
reported  results.  The  emphasis  in  this  paper  is  on  multichannel  analyzer 
count  measurements,  such  as  those  obtained  from  gamma  spectroscopy.  These 
measurements  and  other  nondestructive  assay  (N.D.A.)  measurements  are  used  to 
measure the enrichment (percentage) of special materials, such as U3O8, in the
standards.  The  measurements  are  based  on  an  estimate  of  the  total  number  of 
counts  from  the  special  material.  This  estimate  is  called  the  peak  area 
calculation.  There  is  no  widespread  agreement  on  the  correct  formulas  to 
compute  peak  area,  see  [1].  Thus  when  different  laboratories  measure 
prototype  standards  (and  hence  possibly  different  formulas  are  used  for 
computing  peak  area)  it  is  difficult  to  distinguish  differences  among  the 
standards  from  superficial  differences  caused  by  variations  in  the  peak  area 
formulas  used  by  different  laboratories,  see  [1]. 

The  central  issue  to  be  considered  is  whether  or  not  M proposed  standards 
have  enrichments  sufficiently  similar  to  be  certified  with  the  same 
enrichment  level.  Several  statistical  tests  of  similarity  are  considered.  For 
two  of  these  tests,  algorithms  are  provided  for  finding  linear  peak  area 
formulas  which  make  the  standards  appear  either  most  or  least  homogeneous. 
These  two  tests  are  the  usual  F-statistic  and  the  standardized  range.  This 
report  provides  specific  procedures  for  finding  the  linear  peak  area  formulas 
that  are  best  and  worst  in  the  sense  that  they  make  the  prototype  standards 
look  most  or  least  homogeneous. 

The measurements are a vector of counts. Each entry in the vector is
called a channel. A channel contains an integer count of the number of
particles at nearly a fixed energy level striking the counter. Elevated
counts in certain channels indicate the presence of particular radioisotopes.
However,  due  to  residue  from  past  analysis  and  imperfect  counter  resolution, 
adjustments  for  "background"  must  be  made  to  the  raw  counts.  Adjusted  counts 
are  used  to  provide  more  accurate  estimates  of  the  amounts  of  radioisotopes 
present.  One  of  these  adjustments  is  the  peak  area  computation. 

Two  components  contribute  to  the  peak  area  computation.  One  component  is 
counts  from  the  target  radioisotope,  the  other  is  background.  The  peak  area 
calculation  attempts  to  find  the  shaded  area  in  Figure  1.  This  area  is 
estimated  by  subtracting  an  estimate  of  the  area  due  to  background  (shown  by 
dots)  from  the  estimate  of  total  area  in  the  peak  region.  The  area  due  to 
background  is  estimated  from  counts  in  adjacent  "background  region"  channels. 
The  estimates  of  area  we  consider  here  are  linear  and  are  described  in  detail 
later  in  this  section.  These  are  the  most  common  type  of  area  estimates,  see 
[1].  Some  notation  is  given  below. 

Let $I_j$ denote the number of counts in channel $j$ from the target
radioisotope.

Let $\beta_j$ denote the number of background counts in channel $j$. Let $J_1$
denote the set of indices of the channels in the peak region and $J_2$ denote
the set of indices of the channels in the purely background region. Without
loss of generality take $J_1 = \{1, \ldots, j_1\}$ and
$J_2 = \{j_1 + 1, \ldots, j_2\}$. (Assume these regions are given and fixed
throughout the remainder of this paper.) Typically $J_1$ and $J_2$ are not
overlapping. We assume that there are $M$ proposed standards measured by each
laboratory. Since different multichannel analyzers may have different
characteristics, such as different geometrical construction, it does not make
sense to seek a fixed peak area formula for all laboratories. Instead,
different laboratories' measurements will be dealt with individually. All the
notation that follows is for a single laboratory's measurements.


The observables are:

$$y_j = I_j + \beta_j, \quad j \in J_1$$

$$y_j = \beta_j, \quad j \in J_2$$

The peak area (PA) calculations we consider are linear. Specifically:

$$PA = \sum_{j \in J_1} w_j y_j + \sum_{j \in J_2} w_j y_j$$

Since the weights $w_j = 0$ for $j = 1, 2, \ldots, j_2$ would produce a zero
peak area for all standards, a constraint must be placed on the weights.
Otherwise one may get the trivial vector
$W^t = (w_1, \ldots, w_{j_2}) = (0, 0, \ldots, 0)$ for the weights that make
the standards appear most homogeneous. The constraint used here is

$$\sum_{j \in J_1} w_j = 1$$

which is standard in numerical quadrature. Since the statistics used in the
next section are scale invariant, the value 1 serves only to rule out the
vector $W = 0$. In addition it should be noted that nearly all peak area rules
can be written as $[W(y)]^t y$ where the vector $W(y)$ is some (often
nonlinear) function of $(y_1, \ldots, y_{j_2}) = y^t$, see [1].
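
As a minimal sketch of a linear peak area rule, the following computes $W^t y$ for one spectrum. The function names, the illustrative counts, and the choice of background weights $-1/(j_2 - j_1)$ are assumptions made for the illustration; the text only fixes the normalization of the peak-region weights.

```python
def peak_area(counts, weights):
    """Linear peak area: the weighted sum of channel counts, W^t y."""
    if len(counts) != len(weights):
        raise ValueError("counts and weights must have the same length")
    return sum(w * y for w, y in zip(weights, counts))


def sum_rule_weights(j1, j2):
    """Hypothetical sum-rule weights, normalized so the peak weights sum to 1.

    The first j1 channels (peak region) each get 1/j1. The remaining
    j2 - j1 channels (background region) each get -1/(j2 - j1), an assumed
    choice under which a flat background contributes zero to the peak area.
    """
    n_bg = j2 - j1
    return [1.0 / j1] * j1 + [-1.0 / n_bg] * n_bg


# Illustrative counts (assumed): 4 peak channels, then 4 background channels.
counts = [120, 180, 170, 110, 20, 22, 18, 20]
weights = sum_rule_weights(j1=4, j2=8)
area = peak_area(counts, weights)
```

With these weights the peak area is the mean count in the peak region minus the mean count in the background region.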

Replicate measurements are usually made for each standard. The resulting peak
areas are then compared by an analysis of variance procedure, see [2]. Let
$\mu_k$ denote the mean peak area for the k-th standard. Let $\bar{\mu}$ denote
the average value of $\mu_k$ over all the M standards. One possible measure of
homogeneity for the M standards is

$$\Delta^2 = \sum_{k=1}^{M} (\mu_k - \bar{\mu})^2;$$

if $\Delta$ is small, the standards can be considered homogeneous.

One important pair of hypotheses is:

$$H_0: \Delta \le \delta \quad \text{vs.} \quad H_1: \Delta > \delta$$

for some appropriate $\delta$.

Other measures may also be considered, such as one based on the range. Let:

$$R = \max_k \mu_k - \min_k \mu_k .$$

Then another pair of hypotheses is:

$$H_0': R \le \delta' \quad \text{vs.} \quad H_1': R > \delta'$$

It is not likely that any set of M standards can contain identical
enrichments. Therefore, it may not be necessary to consider
hypotheses that specify identical enrichments. We take $\delta$ and $\delta'$
to be strictly positive.

The choice of $\delta$ and $\delta'$ cannot be given by any a priori rule. The
laboratory making the measurements will have to choose $\delta$ and $\delta'$.
If the standards do satisfy $H_0$ or $H_0'$ they must be similar enough to
fulfill their intended purpose, see Section 4.

SECTION 2. Some test statistics for the hypotheses.

Let PA(m,n) denote the peak area of the n-th replicate of the m-th standard,
m = 1, ..., M, n = 1, ..., N. Let

$$\overline{PA} = \sum_{m,n} PA(m,n) / (MN) \quad \text{and} \quad \overline{PA}(m) = \sum_{n} PA(m,n) / N$$

Then a good test (the uniformly most powerful invariant unbiased test if peak
areas are assumed to be independent and normal) of $H_0$ vs. $H_1$ is given by
using the noncentral F statistic

$$F(\Delta/\sigma^2, M-1, M(N-1)) = \frac{N \sum_{m} (\overline{PA}(m) - \overline{PA})^2}{\sum_{m,n} (PA(m,n) - \overline{PA}(m))^2}$$

This statistic has noncentrality parameter $\Delta/\sigma^2$, and M-1 and
M(N-1) degrees of freedom.

Any test of the form

Reject $H_0$ when F > k
do not reject $H_0$ when F $\le$ k

is a good test. The constant k is chosen to provide a desired level of
significance. In applications
$M(N-1)\,\delta / \sum_{m,n} (PA(m,n) - \overline{PA})^2$ can be used as an
estimate of the noncentrality parameter. The constant k can be chosen
from the tables in [3].
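
The ratio above can be sketched in a few lines, assuming the peak areas are held in a list of M rows (standards) with N replicates each. The function name and the example values are hypothetical. The sketch returns both the ratio as displayed in the text and, as an addition not taken from the text, the conventional mean-square form that divides each sum of squares by its degrees of freedom; the two differ only by the constant factor M(N-1)/(M-1).

```python
def f_ratio(pa):
    """pa[m][n] = peak area of replicate n of standard m.

    Returns (ratio, f_conventional). `ratio` is N * between-SS / within-SS,
    the form displayed in the text. `f_conventional` rescales it by the
    degrees of freedom M-1 and M(N-1) (an added convenience).
    """
    M, N = len(pa), len(pa[0])
    means = [sum(row) / N for row in pa]          # PA-bar(m)
    grand = sum(means) / M                        # PA-bar (balanced design)
    between = N * sum((mk - grand) ** 2 for mk in means)
    within = sum((x - mk) ** 2 for row, mk in zip(pa, means) for x in row)
    ratio = between / within
    f_conventional = ratio * (M * (N - 1)) / (M - 1)
    return ratio, f_conventional


# Illustrative peak areas (assumed): 3 standards, 2 replicates each.
pa = [[10.0, 12.0], [14.0, 16.0], [21.0, 23.0]]
ratio, f_conv = f_ratio(pa)
```

The critical value k would still have to come from noncentral F tables or software, which this sketch does not attempt.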

A statistic for testing $H_0'$ vs. $H_1'$ is based on the range of the
averages $\overline{PA}(m)$. Let $M^*$ and $M_*$ denote the indexes
corresponding to the largest and smallest values of $\overline{PA}(m)$. If
there are two or more indexes corresponding to the largest or smallest value
an arbitrary choice may be made among them. Let the range of the statistics
$\overline{PA}(m)$ be denoted by $\hat{R}$, i.e.,
$\hat{R} = \overline{PA}(M^*) - \overline{PA}(M_*)$. We take as an estimate
for the variance of $\hat{R}$,

$$\hat{\sigma}^2 = \frac{2 \sum_{m,n} (PA(m,n) - \overline{PA}(m))^2}{M(N-1)}$$

By taking this choice we ignore the possibility that $M^*$ and $M_*$ do not
correspond to the populations with the largest and smallest values of $\mu$,
respectively. No universally better choice is known by the author.

Then a statistic for testing $H_0'$ vs. $H_1'$ is:

$$T = \frac{\hat{R} - \delta'}{\hat{\sigma}}$$

In many applications there are hundreds of counts in each channel. Therefore
the probability distribution of these counts can be well approximated by a
normal distribution, see [4]. In these cases T has approximately the
distribution of Student's t with M(N-1) degrees of freedom, see also [5].

The test rejects $H_0'$ if

$$T > K'$$

and does not reject $H_0'$ if

$$T \le K' .$$

The constant $K'$ is obtained from any t table. It should be noticed that the
indexes $M^*$ and $M_*$ may not correspond to the standards with the largest
and smallest theoretical mean level $\mu$. Therefore, the constant $K'$ may
have to be modified, see [6].
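
The range statistic can be sketched the same way. The function name, the example peak areas, and the value of `delta_prime` are assumptions for the illustration, and the variance estimate is coded exactly as it is displayed in the text.

```python
import math


def range_statistic(pa, delta_prime):
    """T = (R-hat - delta') / sigma-hat for peak areas pa[m][n]."""
    M, N = len(pa), len(pa[0])
    means = [sum(row) / N for row in pa]          # PA-bar(m)
    r_hat = max(means) - min(means)               # range of the averages
    within = sum((x - mk) ** 2 for row, mk in zip(pa, means) for x in row)
    var_hat = 2.0 * within / (M * (N - 1))        # estimate as written in the text
    return (r_hat - delta_prime) / math.sqrt(var_hat)


# Illustrative peak areas (assumed): 3 standards, 2 replicates each.
pa = [[10.0, 12.0], [14.0, 16.0], [21.0, 23.0]]
t_value = range_statistic(pa, delta_prime=1.0)
```

The value would then be compared with a t-table constant for M(N-1) degrees of freedom, subject to the modification discussed above.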

Notice that if either hypothesis $H_0$ or $H_0'$ is rejected then multiple
comparison methods should be applied to find the "bad actors" among the group
of standards, see [2]. An example illustrating the use of these tests is
given in Section 4.

The  next  section  deals  with  the  interaction  between  the  weights  W and  the 
test  statistics. 

SECTION  3.  Finding  extreme  peak  areas. 

It  is  clear  that  the  choice  of  weights,  W,  used  in  defining  peak  area 
affects  the  values  of  F and  T.  It  would  be  wonderful  if  we  could  plot  F and  T 
as a function of W. However, $j_2$, the dimension of W, is usually much larger
than 2, see Section 4. This implies that these plots are not practical.

Instead  we  settle  for  the  comparison  of  a few  choices  of  weights.  The 
most  important  choices  are  those  that  the  measurement  laboratory  believes 
produce  meaningful  and  accurate  peak  areas,  see  [1].  In  addition  it  may  be 
useful  to  compute  the  set  of  weights  which  make  the  statistics  F and  T take  on 
their  largest  and  smallest  values.  Procedures  for  calculating  these  weights 
are  given  below. 

The weights for the F statistic are provided first. Let $y_{jmn}$ denote the
counts in the j-th channel of the n-th replicate of the m-th standard,
$j \in J_1 \cup J_2$, $1 \le m \le M$, $1 \le n \le N$. The usual dot notation
is used to denote an average. The average over any set of subscripts is
denoted by replacing the subscript(s) with a dot, e.g.,
$\bar{y}_{j\cdot\cdot} = \sum_{n,m} y_{jmn} / (NM)$.

Let the matrix $A^t$, where a superscript t denotes transpose, have its jm-th
entry, $a_{jm}$, given by the equation
$a_{jm} = \bar{y}_{jm\cdot} - \bar{y}_{j\cdot\cdot}$. This matrix has
dimensions $j_2 \times M$. In addition let the matrix $B^t$
($j_2 \times NM$) have entries whose ((m-1)N+n)-th column is:

$$(y_{1mn} - \bar{y}_{1m\cdot}, \; \ldots, \; y_{j_2 mn} - \bar{y}_{j_2 m\cdot})^t$$

Recall that $W^t = (w_1, \ldots, w_{j_2})$. The F statistic can now be
rewritten as:

$$F = N \, \frac{W^t A^t A W}{W^t B^t B W}$$
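
As a sketch of this matrix form, the following builds A and B from an array of counts and evaluates the ratio for a given weight vector. It assumes NumPy and a hypothetical array layout `y[m, n, j]` (standard, replicate, channel); the synthetic Poisson counts and the weight vector are illustrative only.

```python
import numpy as np


def deviation_matrices(y):
    """y has shape (M, N, j2). Returns A (M x j2) and B (M*N x j2).

    Row m of A holds ybar_{jm.} - ybar_{j..}; row m*N + n of B (0-based)
    holds y_{jmn} - ybar_{jm.}, matching the column order given in the text.
    """
    M, N, J = y.shape
    std_means = y.mean(axis=1)                    # shape (M, J)
    grand = y.mean(axis=(0, 1))                   # shape (J,)
    A = std_means - grand
    B = (y - std_means[:, None, :]).reshape(M * N, J)
    return A, B


def f_ratio_from_weights(y, w):
    """N * W^t A^t A W / W^t B^t B W for weight vector w."""
    A, B = deviation_matrices(y)
    N = y.shape[1]
    return N * (w @ A.T @ A @ w) / (w @ B.T @ B @ w)


# Synthetic counts (assumed): 4 standards, 3 replicates, 6 channels.
rng = np.random.default_rng(1)
y = rng.poisson(150.0, size=(4, 3, 6)).astype(float)
w = np.array([0.25, 0.25, 0.25, 0.25, -0.5, -0.5])
f_matrix = f_ratio_from_weights(y, w)
```

The result should agree with the ratio computed directly from the peak areas `y @ w`, which is a convenient check on the construction of A and B.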


The weights which maximize and minimize F subject to
$\sum_{j \in J_1} w_j = 1$ are characterized below. If the elements of B are
normal and $N \ge 2$ then $B^tB$ has full rank $j_2$. (The assumption of a
normal distribution is only an approximation to the truth. There remains a
very small chance that $B^tB$ is not of full rank. If the rank of $B^tB$ is
not $j_2$, then the rest of the analysis in this section cannot be done.) It
is known that $B^tB$ can be rewritten as the square of a full rank symmetric
matrix, Q, i.e., $B^tB = Q^2$, see [7].

Let $\mu = QW$ (notice that $\mu$ does not denote any mean value). Then F
satisfies the following equations:

$$F = N \, W^t Q (Q^{-1} A^t A Q^{-1}) Q W / \mu^t \mu = N \, \mu^t (Q^{-1} A^t A Q^{-1}) \mu / \mu^t \mu .$$

The constraint $\sum_{j \in J_1} w_j = 1$ can be rewritten as:

$$\mu^t Q^{-1} d = 1$$

where d is a vector having ones in the first $j_1$ and zeroes in the
last $(j_2 - j_1)$ places, i.e.,

$$d^t = (\underbrace{1 \; \ldots \; 1}_{J_1} \; \underbrace{0 \; \ldots \; 0}_{J_2})$$

It is known that the $\mu$ vectors which maximize and minimize

$$\mu^t (Q^{-1} A^t A Q^{-1}) \mu / \mu^t \mu$$

are the eigenvectors corresponding to the largest and smallest eigenvalues
of $Q^{-1} A^t A Q^{-1}$ respectively (see [7]).

Let $v^{(1)}, \ldots, v^{(j_2)}$ denote an orthonormal set of characteristic
vectors (eigenvectors) for this matrix (see [7]).

The eigenvectors $v^{(i)}$, $v^{(j)}$ corresponding to the largest and
smallest eigenvalues satisfy

$$v^{(i)t} Q^{-1} d \ne 0$$
$$v^{(j)t} Q^{-1} d \ne 0$$

with probability one and are the solution, i.e., they maximize and
minimize F. (We note that the independent eigenvectors need only be
orthogonal, and need not be orthonormal. The F statistic is invariant to
scale changes.) The untransformed vectors $v^{(i)}$, $v^{(j)}$ are only of
intermediate interest. The final choice of weights is obtained from

$$W = Q^{-1} v .$$
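
The procedure can be sketched numerically as follows, assuming NumPy. The function name, the rank tolerance, and the synthetic data are assumptions for the illustration. The symmetric square root Q is obtained here from an eigendecomposition of $B^tB$, which is one of several ways to compute it.

```python
import numpy as np


def extreme_f_weights(A, B, j1):
    """Return (w_min, w_max, lam_min, lam_max) for W^t A^t A W / W^t B^t B W.

    A is M x j2, B is (M*N) x j2, and the first j1 channels are the peak
    region. F itself is N times the returned eigenvalues.
    """
    S = B.T @ B
    evals, evecs = np.linalg.eigh(S)
    if evals[0] <= 1e-12 * evals[-1]:
        raise ValueError("B^t B is not of full rank; the analysis cannot proceed")
    # Inverse of the symmetric square root Q, where Q @ Q = B^t B.
    Q_inv = evecs @ np.diag(evals ** -0.5) @ evecs.T
    C = Q_inv @ A.T @ A @ Q_inv
    C = (C + C.T) / 2.0                 # remove round-off asymmetry
    lam, V = np.linalg.eigh(C)          # eigenvalues in ascending order
    d = np.zeros(S.shape[0])
    d[:j1] = 1.0

    def to_weights(v):
        w = Q_inv @ v                   # W = Q^{-1} v
        return w / (w @ d)              # rescale so the peak weights sum to 1

    return to_weights(V[:, 0]), to_weights(V[:, -1]), lam[0], lam[-1]


# Synthetic counts (assumed): 6 standards, 4 replicates, 5 channels, 3 in the peak.
rng = np.random.default_rng(0)
y = rng.poisson(200.0, size=(6, 4, 5)).astype(float)
std_means = y.mean(axis=1)
A = std_means - std_means.mean(axis=0)
B = (y - std_means[:, None, :]).reshape(-1, 5)
w_min, w_max, lam_min, lam_max = extreme_f_weights(A, B, j1=3)
```

Because the ratio is scale invariant, the final rescaling changes only the normalization of the weights, not the value of F they produce.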

The  study  of  these  eigenvectors  (weights)  may  be  of  independent  interest. 
This  is  particularly  true  for  the  weights  which  minimize  the  F statistic. 

Our  analysis  is  similar  to  principal  components  analysis,  see  [7].  In 
particular  if  identical  standards  are  analyzed  by  the  above  analysis  then  the 
eigenvectors  help  to  characterize  the  measurement  process. 


Next the weights for the T statistic are found. We use the matrix Q
defined in the last derivation. Consider the M(M-1) vectors

$$D_{mm'} = Q^{-1} (\bar{y}_{1m\cdot} - \bar{y}_{1m'\cdot}, \; \ldots, \; \bar{y}_{j_2 m\cdot} - \bar{y}_{j_2 m'\cdot})^t$$

for all m and m'. Let $\mu = QW$ as before. Notice that T is proportional to

$$\max_{m,m'} \frac{\mu^t D_{mm'}}{(\mu^t \mu)^{1/2}} .$$

Then the vector $\mu$ which maximizes T is the vector $D_{mm'}$ of greatest
length. (Recall that the constraint $\sum_{j \in J_1} w_j = 1$ serves only to
rule out the null vector.)
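
A minimal numerical sketch of the maximizing weights follows, assuming NumPy and the hypothetical array layout `y[m, n, j]` (standard, replicate, channel). It treats T as proportional to the range divided by $(W^t B^t B W)^{1/2}$, which sets aside the $\delta'$ term; the function names and synthetic data are illustrative.

```python
import numpy as np


def max_range_weights(y, j1):
    """Weights maximizing range / (W^t B^t B W)^(1/2) for counts y[m, n, j]."""
    M, N, J = y.shape
    std_means = y.mean(axis=1)                          # ybar_{jm.}
    B = (y - std_means[:, None, :]).reshape(M * N, J)
    evals, evecs = np.linalg.eigh(B.T @ B)
    Q_inv = evecs @ np.diag(evals ** -0.5) @ evecs.T    # inverse of Q
    best = None
    for m in range(M):
        for k in range(M):
            if m != k:
                D = Q_inv @ (std_means[m] - std_means[k])
                if best is None or D @ D > best @ best:
                    best = D                            # longest D_{mm'} so far
    w = Q_inv @ best                                    # W = Q^{-1} mu
    d = np.zeros(J)
    d[:j1] = 1.0
    return w / (w @ d)                                  # peak weights sum to 1


def range_ratio(y, w):
    """Range of the mean peak areas divided by (W^t B^t B W)^(1/2)."""
    pa = y @ w                                          # shape (M, N)
    means = pa.mean(axis=1)
    within = ((pa - means[:, None]) ** 2).sum()
    return (means.max() - means.min()) / np.sqrt(within)


# Synthetic counts (assumed): 6 standards, 4 replicates, 5 channels, 3 in the peak.
rng = np.random.default_rng(2)
y = rng.poisson(200.0, size=(6, 4, 5)).astype(float)
w_best = max_range_weights(y, j1=3)
best_ratio = range_ratio(y, w_best)
```

Comparing `best_ratio` with the same ratio for arbitrary weight vectors gives a quick check that no other linear rule makes the standards look less homogeneous under this measure.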


The weights that minimize T satisfy the following quadratic
programming problem:

$$\min_{W} Z$$

such that for all m and m'

$$\mu^t D_{mm'} \le Z \quad \text{and} \quad \mu^t \mu = 1, \quad \text{where} \quad \mu = QW .$$

Discussion.

The largest and smallest values of a set of F and T statistics do not have
F and t distributions. The distributions for the F statistic case are given
in [7] under $H_0$ with $\delta = 0$. The distributions for the T statistic
case have yet to be worked out, and according to M.A.H. Dempster [8]
distributions for stochastic programming problems are complicated.


SECTION  4.  An  F and  T Statistic  Example. 

The  National  Bureau  of  Standards  (NBS)  is  participating  in  a certification 
program for U3O8 low enrichment (<3%) standards. At the time of this writing
NBS  has  only  analyzed  prototype  standards.  There  are  three  replicate 
measurements  for  each  of  twelve  prototype  standards.  The  three  measurements 
for one peak and its background region, for one of the samples, are shown in
Figure 2. The peak area was computed using the sum rule, i.e.,
$w_1 = w_2 = \cdots = w_{j_1} = 0.1$ and
$w_{j_1+1} = w_{j_1+2} = \cdots = w_{j_2} =$ … where $j_2 =$ … . The value of
the F statistic for this peak area formula is shown in Table 1 along with
the individual means $\overline{PA}(m)$ for each of the prototype standards. The value of
the  T statistic  is  63.7.  These  statistics  correspond  to  real  differences 
among  the  standards  of  about  (10”^)%  enrichment.  On  the  basis  of  both 
statistics  we  can  say  the  prototype  standards  are  different  but  good  enough 
for  their  intended  use. 

The weights corresponding to the extreme values of the F and T statistics
were not computed. However, it is conjectured that the weights corresponding
to the maximum of these statistics will give large weight to anomalies in the
data. It is also conjectured that if the prototype standards are nearly equal
then the weights which minimize the statistics should give appropriate peak
area rules.

SECTION  5.  Additional  Considerations. 

In addition to the T and F, other statistics may be worth computing. These
include rank tests, see [9]. For testing $H_0'$ vs. $H_1'$ the maximum
likelihood ratio statistic is known to be good when N is large.

The  minimizing  and  maximizing  weights  for  the  T statistic  may  not  be  the 
only  weights  of  interest.  For  exploratory  purposes  it  will  be  useful  to 
choose  weights  orthogonal  to  the  previously  chosen  ones,  which  maximize  the  T 
statistic.  Finally  it  may  be  worthwhile  to  track  the  optimum  weights  over 
time,  since  they  are  directly  related  to  the  necessary  background  correction. 

SECTION  6.  Conclusion  and  Summary. 

Two important statistics, F and T, were given for testing the sets of
hypotheses ($H_0$ vs. $H_1$) and ($H_0'$ vs. $H_1'$). These hypotheses are
related directly to two possible measures of the homogeneity of the
standards. For
these  statistics  procedures  were  given  for  finding  weights  which  make  the 
standards  look  most  and  least  homogeneous.  The  F and  T statistics  from  these 
weights  can  be  used  as  a basis  for  conservative  decisions  about  the  standards. 
For  example,  the  minimum  value  found  for  the  T statistic  is  at  least  as  small 
as  the  value  of  T that  would  be  found  using  a "proper"  choice  of  peak  area 
weights.  Thus  if  the  minimum  value  of  T is  large,  one  can  be  sure  that  no 
linear  peak  area  formula  could  reduce  an  apparent  inhomogeneity  of  standards. 


Acknowledgement:  I thank  J.  R.  Rosenblatt  for  many  helpful  suggestions. 

The  data  in  Section  4 were  jointly  analyzed  with  D.  Hogben. 